Regret of Age-of-Information Bandits
Authors
Abstract
We consider a system with a single source that measures/tracks a time-varying quantity and periodically attempts to report these measurements to a monitoring station. Each update from the source has to be scheduled on one of $K$ available communication channels. The probability of success of each attempted communication is a function of the channel used. This function is unknown to the scheduler. The metric of interest is the Age-of-Information (AoI), formally defined as the time elapsed since the destination received the most recent update from the source. We model our scheduling problem as a variant of the multi-arm bandit problem with the communication channels as arms. We characterize a lower bound on the AoI regret achievable by any policy and characterize the performance of UCB, Thompson Sampling, and their variants. Our analytical results show that UCB and Thompson Sampling are order-optimal for AoI bandits. In addition, we propose novel policies which, unlike UCB and Thompson Sampling, use the current AoI to make decisions. Via simulations, we show that the proposed AoI-aware policies outperform existing AoI-agnostic policies.
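To make the setup concrete, here is a minimal simulation sketch (not the authors' code): a single source schedules each update on one of $K$ channels with unknown success probabilities, the scheduler picks channels with the standard UCB1 index, and cumulative AoI regret is measured against a genie that always uses the best channel. All parameter values and variable names below are illustrative assumptions.

```python
# Sketch of an AoI bandit simulation with a UCB1 channel scheduler.
# Everything here (K, T, the success probabilities, the regret bookkeeping)
# is an illustrative assumption, not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

K = 5                                  # number of channels (arms)
T = 10_000                             # time horizon
p = rng.uniform(0.2, 0.9, size=K)      # true (unknown) success probabilities

successes = np.zeros(K)
pulls = np.zeros(K)
aoi_ucb = 0                            # AoI under the UCB scheduler
aoi_opt = 0                            # AoI under a genie that knows the best channel
best = int(np.argmax(p))
regret = np.zeros(T)

for t in range(1, T + 1):
    # UCB1 index: empirical success rate + exploration bonus (try each arm once first).
    if t <= K:
        k = t - 1
    else:
        ucb = successes / pulls + np.sqrt(2 * np.log(t) / pulls)
        k = int(np.argmax(ucb))

    # Attempt transmission on the chosen channel and on the genie's channel.
    delivered = rng.random() < p[k]
    delivered_opt = rng.random() < p[best]

    pulls[k] += 1
    successes[k] += delivered

    # AoI resets to 1 on a successful delivery, otherwise grows by 1.
    aoi_ucb = 1 if delivered else aoi_ucb + 1
    aoi_opt = 1 if delivered_opt else aoi_opt + 1

    # Cumulative AoI regret: extra age accumulated relative to the genie.
    regret[t - 1] = (regret[t - 2] if t > 1 else 0) + (aoi_ucb - aoi_opt)

print(f"estimated success probs: {np.round(successes / np.maximum(pulls, 1), 2)}")
print(f"cumulative AoI regret after T={T}: {regret[-1]:.0f}")
```

The AoI-aware policies proposed in the paper would additionally feed the current age (here `aoi_ucb`) into the channel-selection rule; the sketch above only covers the AoI-agnostic UCB baseline.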
Similar resources
Regret of Queueing Bandits
We consider a variant of the multiarmed bandit problem where jobs queue for service, and service rates of different servers may be unknown. We study algorithms that minimize queue-regret: the (expected) difference between the queue-lengths obtained by the algorithm, and those obtained by a “genie”-aided matching algorithm that knows exact service rates. A naive view of this problem would sugges...
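Paraphrasing that definition in symbols (notation ours, not necessarily the paper's), the queue-regret at time $t$ is

$$\Psi(t) = \mathbb{E}\big[Q(t) - Q^{*}(t)\big],$$

where $Q(t)$ is the queue length under the learning algorithm and $Q^{*}(t)$ is the queue length under the genie-aided matching that knows the exact service rates.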
Dueling Bandits with Weak Regret
We consider online content recommendation with implicit feedback through pairwise comparisons, formalized as the so-called dueling bandit problem. We study the dueling bandit problem in the Condorcet winner setting, and consider two notions of regret: the more well-studied strong regret, which is 0 only when both arms pulled are the Condorcet winner; and the less well-studied weak regret, which...
Analysis of Reading Comprehension Needs of the Students of Paramedical Studies: The Case of the Students of Health Information Management (HIM)
No abstract available.
Bayesian Bandits, Secretaries, and Vanishing Computational Regret
We consider the finite-horizon multi-armed bandit problem under the standard stochastic assumption of independent priors over the reward distributions of the arms. We define a new notion of computational regret against the Bayesian optimum solution instead of worst-case against the true underlying distributions. We show that when the priors of the arms satisfy a log-concavity condition, there i...
Regret Bounds for Deterministic Gaussian Process Bandits
This paper analyzes the problem of Gaussian process (GP) bandits with deterministic observations. The analysis uses a branch and bound algorithm that is related to the UCB algorithm of (Srinivas et al., 2010). For GPs with Gaussian observation noise, with variance strictly greater than zero, (Srinivas et al., 2010) proved that the regret vanishes at the approximate rate of $O(1/\sqrt{t})$, where $t$...
Journal
Journal title: IEEE Transactions on Communications
Year: 2022
ISSN: 1558-0857, 0090-6778
DOI: https://doi.org/10.1109/tcomm.2021.3118037